Significantly Different Textures: A Computational Model of Pre-attentive Texture Segmentation
Abstract
Recent human vision research [1] suggests modelling pre-attentive texture segmentation by taking a set of feature samples from a local region on each side of a hypothesized edge, and then performing standard statistical tests to determine if the two samples differ significantly in their mean or variance. If the difference is significant at a specified level of confidence, a human observer will tend to pre-attentively see a texture edge at that location. I present an algorithm based upon these results, with a well specified decision stage and intuitive, easily fit parameters. Previous models of pre-attentive texture segmentation have poorly specified decision stages, more unknown free parameters, and in some cases incorrectly model human performance. The algorithm uses heuristics for guessing the orientation of a texture edge at a given location, thus improving computational efficiency by performing the statistical tests at only one orientation for each spatial location.

1 Pre-attentive Texture Segmentation

Pre-attentive texture segmentation refers to the phenomenon in human vision in which two regions of texture quickly (i.e., in less than 250 ms) and effortlessly segregate. Observers may perceive a boundary or edge between the two regions. In computer vision, we would like to find semantically meaningful boundaries between different textures. One way of estimating these boundaries is to find boundaries that would be found by a human observer. The boundaries thus defined should be sufficient for most computer vision applications. Whether a human observer can distinguish two textures depends upon whether the discrimination is pre-attentive or attentive. The experimental literature tells us far more about pre-attentive segmentation than attentive discrimination.

Researchers have suggested both feature- and filter-based models of pre-attentive texture segmentation. Many of the feature-based models have been statistical in nature.
Julesz [2] suggested that pre-attentive segmentation is determined by differences in the 2nd-order statistics of the texture, or differences in the 1st-order statistics of "textons" such as line terminators and corners [3]. Beck, Prazdny, & Rosenfeld [4] suggested that texture segmentation is based upon differences in the first-order statistics of stimulus features such as orientation, size, and contrast. However, these theories do not indicate how these differences might be quantified, or what properties of the statistics might be used. Furthermore, such models have not typically been implemented such that they could be tested on actual images.

Filter-based models [e.g. 5, 6, 7, 8] have suggested that texture segmentation is determined by the responses of spatial-frequency channels, where the channels contain both linear filtering mechanisms and various non-linearities. Malik & Perona’s model [7] provides the most developed example of this type of model. It involves linear bandpass filtering, followed by half-wave rectification, non-linear inhibition and excitation among channels and among neighboring spatial locations, filtering with large-scale Gaussian first derivative filters, and a decision based upon the maximum response from the final filtering stage.

These models often contain many unknown parameters. What weights specify the inhibition and excitation among the different filter responses? What scale should one use for the Gaussian 1st-derivative filters? Perhaps most importantly, such models often contain an arbitrary, unspecified, threshold for determining the existence of a perceived edge between two textures. Many of these filter-based models are notoriously vague about the final decision stage. Furthermore, such models don’t give us much insight into which textures will segment, since the comparison carried out by the model is often obscured by the details of the filtering, non-linearities, and image-based decision stage.
What is the meaning of the texture gradient computed in the penultimate stage of the Malik & Perona model?

This paper describes a working texture segmentation algorithm that mimics human pre-attentive texture segmentation. Section 2 reviews recent human vision experiments [1], which were aimed at studying what first-order statistics determine texture segmentation. These results suggest modelling pre-attentive texture segmentation by standard statistical tests for a difference in mean and standard deviation of various features such as orientation and contrast. Section 3 reviews previous models of pre-attentive texture segmentation in light of these results, and discusses the relationship to other edge detection and image segmentation algorithms. Section 4 presents a biologically plausible, filter-based algorithm based upon the experimental results in Section 2 and those of Kingdom & Keeble [9]. Section 5 presents results of this algorithm on artificial and natural images.

2 Recent Experimental Results in Pre-attentive Texture Segmentation

In [1], I studied segmentation of orientation-defined textures such as those shown in Figure 1. In each of the three experiments, observers viewed each texture pair for 250 ms, and the task was to indicate whether the boundary between the two textures fell to the left or right of the center of the display.

If, as Beck et al. [4] suggested, texture segmentation of orientation-defined textures is based upon differences in the 1st-order statistics of orientation, to what 1st-order statistics does this refer? If the difference in mean orientation is the crucial quantity, two textures should segment if this difference lies above a certain threshold, independent of other properties of the orientation distributions. A more plausible possibility is that the determining factor is the significance of the difference in mean orientations.
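To make "significance of the difference in mean" concrete, here is a minimal sketch, not taken from the experiments in [1]. It assumes a two-sided z-test (a large-sample normal approximation standing in for whichever standard test is actually used), treats orientation as a linear variable that does not wrap around, and uses illustrative values throughout: 16 samples per side, a 30-degree difference in mean, and standard deviations of 10 and 50 degrees. The function name `mean_difference_z_test` is hypothetical.

```python
import math
from statistics import NormalDist


def mean_difference_z_test(mean_a, mean_b, sd_a, sd_b, n_a, n_b, alpha=0.05):
    """Two-sided z-test for a difference in mean between two samples.

    Returns (z, p_value, significant). The normal approximation is an
    assumption of this sketch; a t-test is the usual choice for small samples.
    """
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_b - mean_a) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Illustrative values only: 16 orientation samples per side, means 30 degrees apart.
homogeneous = mean_difference_z_test(0.0, 30.0, sd_a=10.0, sd_b=10.0, n_a=16, n_b=16)
heterogeneous = mean_difference_z_test(0.0, 30.0, sd_a=50.0, sd_b=50.0, n_a=16, n_b=16)

print("homogeneous:   z = %.2f, significant = %s" % (homogeneous[0], homogeneous[2]))
print("heterogeneous: z = %.2f, significant = %s" % (heterogeneous[0], heterogeneous[2]))
```

With these illustrative numbers, the same 30-degree difference gives z of roughly 8.5 for the low-variability pair (significant at the 0.05 level) and roughly 1.7 for the high-variability pair (not significant), so a fixed threshold on the raw difference in mean would treat the two cases identically while the significance test does not.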
The significance of the difference takes into account the variability of the textures, so that two homogeneous textures with means differing by 30 degrees may segment, while two heterogeneous textures with the same difference in mean may not. Perhaps observers can also segment two textures that differ only in their variability. Other parameters of the distribution might also be relevant, such as the skew or kurtosis. Alternatively, observers might be able to segment two textures given any sufficiently large difference in their first-order statistics.

The first experiment asked observers to segment two textures that differed only in their mean orientation. Each texture had orientations drawn from a wrapped normal distribution [10]. The experiment determined the threshold difference in mean orientation, at which observers can correctly localize the texture boundary 82% of the time, for 4 different values of the orientation standard deviation. Figure 2a shows the results. Clearly observers can segment two textures differing only in their mean orientation. Furthermore, the difference in mean required to perform the segmentation task depends upon the standard deviation.

The second experiment determined whether or not observers could pre-attentively segment textures that differed only in the variance of their orientation distributions. For two possible baseline standard deviations, the experiment measured the threshold increment in standard deviation at which observers could correctly localize the texture boundary 82% of the time. Observers could segment textures differing only in their variance, and Figure 2b shows the thresholds found. The difference in variance required depends upon the baseline standard deviation.

The third experiment tested segmentation of a unimodal wrapped-normal distribution from a discrete, bimodal distribution with the same mean orientation and variance.
This experiment measured percent correct performance, for 4 possible spacings of the modes of the bimodal distribution. The results are shown in Figure 2c. All observers [...]

[Figure panels (a), (b), (c), (d): strength = 5.6, strength = 4.8, strength = 2.6, strength = 3.3]
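The stimuli in the first two experiments draw orientations from a wrapped normal distribution. As a rough sketch of what that involves, the code below samples orientations for two textures and estimates their circular mean and standard deviation. Everything specific here is an assumption made for illustration: the function names, the means of 45 and 75 degrees, the standard deviation of 10 degrees, the sample count, and the use of angle doubling to handle the 180-degree periodicity of orientation. It is not the stimulus-generation code used in [1].

```python
import math
import random


def sample_wrapped_normal_orientations(mean_deg, sd_deg, n, rng):
    """Draw n orientations (degrees, in [0, 180)) from a wrapped normal."""
    # Sample on the real line, then wrap onto the 180-degree orientation circle.
    return [rng.gauss(mean_deg, sd_deg) % 180.0 for _ in range(n)]


def circular_mean_and_sd(orientations_deg):
    """Circular mean and standard deviation of orientations, in degrees."""
    # Orientations are axial (theta and theta + 180 are the same), so double
    # the angles to map them onto a full 360-degree circle.
    doubled = [math.radians(2.0 * theta) for theta in orientations_deg]
    c = sum(math.cos(a) for a in doubled) / len(doubled)
    s = sum(math.sin(a) for a in doubled) / len(doubled)
    # Mean direction on the doubled circle, halved to get back to orientation.
    mean_deg = (math.degrees(math.atan2(s, c)) / 2.0) % 180.0
    # Mean resultant length R, clamped to (0, 1] to guard against rounding.
    r = min(max(math.hypot(c, s), 1e-12), 1.0)
    # Circular SD is sqrt(-2 ln R) on the doubled circle; halve it as well.
    sd_deg = math.degrees(math.sqrt(-2.0 * math.log(r))) / 2.0
    return mean_deg, sd_deg


def orientation_difference(a_deg, b_deg):
    """Smallest absolute difference between two orientations, in degrees."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
left = sample_wrapped_normal_orientations(45.0, 10.0, 2000, rng)
right = sample_wrapped_normal_orientations(75.0, 10.0, 2000, rng)

left_mean, left_sd = circular_mean_and_sd(left)
right_mean, right_sd = circular_mean_and_sd(right)
print("left:  mean = %.1f deg, sd = %.1f deg" % (left_mean, left_sd))
print("right: mean = %.1f deg, sd = %.1f deg" % (right_mean, right_sd))
print("difference in mean = %.1f deg" % orientation_difference(left_mean, right_mean))
```

The angle doubling matters near the wrap point: orientations of 170 and 10 degrees average to about 0 degrees under this scheme, whereas a plain arithmetic mean would give 90 degrees.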
Publication date: 2000